Learning Probabilistic Dependency Grammars from Labeled Text
Abstract
We present the results of experimenting with schemes for learning probabilistic dependency grammars for English from corpora labelled with part-of-speech information. We intend our system to produce wide-coverage grammars that bear some resemblance to the standard context-free grammars of English that grammarians and linguists commonly present as examples.
Similar resources
Covariance in Unsupervised Learning of Probabilistic Grammars
Probabilistic grammars offer great flexibility in modeling discrete sequential data like natural language text. Their symbolic component is amenable to inspection by humans, while their probabilistic component helps resolve ambiguity. They also permit the use of well-understood, general-purpose learning algorithms. There has been an increased interest in using probabilistic grammars in the Bayes...
Inducing Tree-Substitution Grammars
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has...
Two Experiments on Learning Probabilistic Dependency Grammars from Corpora
Introduction: We present a scheme for learning probabilistic dependency grammars from positive training examples plus constraints on rules. In particular we present the results of two experiments. The first, in which the constraints were minimal, was unsuccessful. The second, with significant constraints, was successful within the bounds of the task we had set. We will explicate dependency gramm...
Approaches for Learning Constraint Dependency Grammar from Corpora
This paper evaluates two methods of learning constraint dependency grammars from corpora: one uses the sentences directly and the other uses subgrammar-expanded sentences. Learning curves and test set parsing results show that grammars generated directly from sentences have a low degree of parse ambiguity, but at the cost of a slow learning rate and less grammar generality. Augmenting these senten...
Computational Learning of Probabilistic Grammars in the Unsupervised Setting
With the rising amount of available multilingual text data, computational linguistics faces an opportunity and a challenge. This text can enrich the domains of NLP applications and improve their performance. Traditional supervised learning for this kind of data would require annotation of part of this text for induction of natural language structure. For these large amounts of rich text, such a...